Windows Azure Tables and Queues Deep Dive
Ing. Eduardo Castro, PhD
Comunidad Windows
ecastro@grupoasesor.net
http://ecastrom.blogspot.com

Agenda
- Overview of Windows Azure Tables
- Patterns and Practices for Windows Azure Tables
- Overview of Windows Azure Queues
- Patterns and Practices for Windows Azure Queues
- Q&A

Fundamental Storage Abstractions
- Tables: provide structured storage. A table is a set of entities, each of which contains a set of properties.
- Queues: provide reliable storage and delivery of messages for an application.
- Blobs: provide a simple interface for storing named files along with metadata for the file.
- Drives: provide durable NTFS volumes for Windows Azure applications to use (new).

Windows Azure Tables
- Provides structured storage
- Massively scalable tables
  - Billions of entities (rows) and TBs of data
  - Can use thousands of servers as traffic grows
- Highly available and durable
  - Data is replicated several times
- Familiar and easy-to-use API
  - ADO.NET Data Services (.NET 3.5 SP1): .NET classes and LINQ
  - REST: with any platform or language

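Table Storage Concepts
Accounts contain tables, and tables contain entities.
- Account: moviesonline
  - Table: Users
    - Entity: Email = …, Name = …
    - Entity: Email = …, Name = …
  - Table: Movies
    - Entity: Genre = …, Title = …
    - Entity: Genre = …, Title = …
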
Table Data Model
- Table
  - A storage account can create many tables
  - Table name is scoped by account
  - Set of entities (i.e. rows)
- Entity
  - Set of properties (columns)
  - Required properties: PartitionKey, RowKey and Timestamp

Required Entity Properties
- PartitionKey and RowKey
  - Uniquely identify an entity
  - Define the sort order
  - Use them to scale your application
- Timestamp
  - Read only
  - Optimistic concurrency

PartitionKey and Partitions
- PartitionKey
  - Used to group entities in the table into partitions
- A table partition
  - All entities with the same partition key value
  - Unit of scale
  - Control entity locality
- Row key provides uniqueness within a partition

Partitions and Partition Ranges
- Server A: Table = Movies
- Server A: Table = Movies, range [Action - Comedy)
- Server B: Table = Movies, range [Comedy - Western)

Table Operations
- Table
  - Create
  - Query
  - Delete
- Entities
  - Insert
  - Update
    - Merge: partial update
    - Replace: update entire entity
  - Delete
  - Query
  - Entity Group Transaction (new)

Table Schema
Define the schema as a .NET class

    using System;
    using System.Data.Services.Common;

    [DataServiceKey("PartitionKey", "RowKey")]
    public class Movie
    {
        /// <summary>
        /// Category is the partition key
        /// </summary>
        public string PartitionKey { get; set; }

        /// <summary>
        /// Title is the row key
        /// </summary>
        public string RowKey { get; set; }

        public DateTime Timestamp { get; set; }
        public int ReleaseYear { get; set; }
        public string Language { get; set; }
        public string Cast { get; set; }
    }

Table SDK Sample Code

    using System.Linq;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;

    StorageCredentialsAccountAndKey credentials =
        new StorageCredentialsAccountAndKey("myaccount", "myKey");
    string baseUri = "http://myaccount.table.core.windows.net";
    CloudTableClient tableClient = new CloudTableClient(baseUri, credentials);
    tableClient.CreateTable("Movies");

    TableServiceContext context = tableClient.GetDataServiceContext();

    // Point query: both PartitionKey and RowKey are matched with equality
    CloudTableQuery<Movie> q =
        (from movie in context.CreateQuery<Movie>("Movies")
         where movie.PartitionKey == "Action" && movie.RowKey == "The Bourne Ultimatum"
         select movie).AsTableServiceQuery<Movie>();
    Movie movieToUpdate = q.FirstOrDefault();

    // Update movie
    context.UpdateObject(movieToUpdate);
    context.SaveChangesWithRetries();

    // Add movie (movieToAdd holds the new title; it is not defined on the slide).
    // AddObject takes the table name as its first argument.
    context.AddObject("Movies", new Movie { PartitionKey = "Action", RowKey = movieToAdd });
    context.SaveChangesWithRetries();

Key Selection: Things to Consider
- Scalability
  - Distribute load as much as possible
  - Hot partitions can be load balanced
  - PartitionKey is critical for scalability
- Query efficiency and speed
  - Avoid frequent large scans
  - Parallelize queries
- Entity group transactions (new)
  - Transactions across a single partition
  - Transaction semantics and fewer round trips

Key Selection: Case Study 1
- Table for listing all movies
- Home page lists movies based on chosen category

Movie Listing - Solution 1
- Why do I need multiple PartitionKeys?
- Account name as PartitionKey
- Movie title as RowKey, since movie names need to be sorted
- Category as a separate property
- Does this scale?

Movie Listing - Solution 1
- Single partition: the entire table is served by one server
- All requests are served by that single server
- Does not scale
(Diagram: every client request goes to Server A.)

Movie Listing - Solution 2
- All movies partitioned by category
- Allows the system to load balance hot partitions
- Load distributed
- Better than a single partition
(Diagram: client requests are spread across Server A and Server B.)

Key Selection: Case Study 2
- Log every transaction into a table for diagnostics
- Scale
- Write-intensive scenario
- Logs can be retrieved for a given time range

Logging - Solution 1
- Timestamp as PartitionKey
  - Looks like an obvious choice
  - It is not a single partition as time moves forward
- Append only
  - Requests go to a single partition range
  - Load balancing does not help
  - Server may throttle
(Diagram: all application and client requests land on the same server while Server B receives none.)

Logging - Solution 2: Distribute "Append Only"
- Prefix the timestamp so that load is distributed
  - Id of the node logging
  - Hash into N buckets
- Write load is now distributed
  - Better throughput
- To query logs in a time range
  - Parallelize it across prefix values
(Diagram: application and client requests are spread across Server A and Server B.)

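A minimal sketch of the prefixing idea on this slide, assuming a two-digit bucket prefix and a `yyyyMMddHHmmss` timestamp format. The class `LogKeys`, its method names and the key layout are hypothetical choices for illustration, not something the deck prescribes. It hashes the logging node's id into N buckets to build the PartitionKey, and lists one key range per bucket so a time-range query can be issued once per prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class LogKeys
{
    // FNV-1a style hash over the characters of the node id. A hand-written hash is
    // used because string.GetHashCode() is not guaranteed to be stable across processes.
    public static int Bucket(string nodeId, int bucketCount)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in nodeId)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return (int)(hash % (uint)bucketCount);
        }
    }

    // Assumed key layout: "<bucket>_<timestamp>", zero-padded so keys sort lexically.
    private static string Format(int bucket, DateTime utc)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0:D2}_{1:yyyyMMddHHmmss}", bucket, utc);
    }

    public static string PartitionKey(string nodeId, DateTime utc, int bucketCount)
    {
        return Format(Bucket(nodeId, bucketCount), utc);
    }

    // One (inclusive lower, exclusive upper) PartitionKey range per bucket.
    // Each range can be queried independently, for example in parallel.
    public static List<Tuple<string, string>> RangesFor(DateTime fromUtc, DateTime toUtc, int bucketCount)
    {
        var ranges = new List<Tuple<string, string>>();
        for (int b = 0; b < bucketCount; b++)
        {
            ranges.Add(Tuple.Create(Format(b, fromUtc), Format(b, toUtc)));
        }
        return ranges;
    }
}

public static class Program
{
    public static void Main()
    {
        var now = new DateTime(2010, 1, 2, 3, 4, 5, DateTimeKind.Utc);
        Console.WriteLine(LogKeys.PartitionKey("node-17", now, 8));
        foreach (var r in LogKeys.RangesFor(now.AddHours(-1), now, 8))
        {
            Console.WriteLine("PartitionKey >= '{0}' and PartitionKey < '{1}'", r.Item1, r.Item2);
        }
    }
}
```

The trade-off is the one the slide names: writes spread over N partition ranges, while a time-range read now costs N queries instead of one.
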
Key Selection: Query Efficiency & Speed
- Select keys that allow fast retrieval
- Reduce scan range
- Reduce scan frequency

Single Entity Query
- Where PartitionKey = 'SciFi' and RowKey = 'Star Trek'
- Efficient processing
- No continuation tokens
(Diagram: the client sends one request and a single server returns the result.)

Table Scan Query
- Select * from Movies where Rating > 4
- Returns a continuation token when
  - 1000 movies are in the result set
  - A partition range boundary is hit
- Serial processing: wait for the continuation token before proceeding
(Diagram: the client sends a request, a server returns 1000 movies or hits a partition range boundary and returns a continuation token, and the client sends the next request with that token, moving from Server A to Server B.)

Make Scans Faster
- Split "Select * from Movies where Rating > 4" into
  - Where PartitionKey >= "A" and PartitionKey < "D" and Rating > 4
  - Where PartitionKey >= "D" and PartitionKey < "I" and Rating > 4
  - Etc.
- Execute in parallel
- Each query handles continuation
(Diagram: the client sends requests to Server A and Server B at the same time, each returning its own continuation tokens.)

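The range split above can be sketched without the storage service. In this illustration an in-memory list stands in for the Movies table, and `MovieRow`, `ParallelScan`, `QueryRange` and the boundary values are hypothetical names chosen for the example. In a real application each task would run one range query against the table and follow its own continuation tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class MovieRow
{
    public string PartitionKey;
    public string RowKey;
    public int Rating;
}

public static class ParallelScan
{
    // Stand-in for one range query:
    // PartitionKey >= lower and PartitionKey < upper and Rating > minRating.
    // An upper bound of null means "to the end of the table".
    public static List<MovieRow> QueryRange(
        IEnumerable<MovieRow> table, string lower, string upper, int minRating)
    {
        return table.Where(m =>
            string.CompareOrdinal(m.PartitionKey, lower) >= 0 &&
            (upper == null || string.CompareOrdinal(m.PartitionKey, upper) < 0) &&
            m.Rating > minRating).ToList();
    }

    // Boundaries {"A", "D", "I"} produce the ranges [A, D), [D, I) and [I, end).
    public static List<MovieRow> Scan(List<MovieRow> table, string[] boundaries, int minRating)
    {
        var tasks = new List<Task<List<MovieRow>>>();
        for (int i = 0; i < boundaries.Length; i++)
        {
            string lower = boundaries[i];
            string upper = i + 1 < boundaries.Length ? boundaries[i + 1] : null;
            tasks.Add(Task.Run(() => QueryRange(table, lower, upper, minRating)));
        }
        Task.WaitAll(tasks.ToArray());
        // Merge the per-range results in range order.
        return tasks.SelectMany(t => t.Result).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var table = new List<MovieRow>
        {
            new MovieRow { PartitionKey = "Action", RowKey = "The Bourne Ultimatum", Rating = 5 },
            new MovieRow { PartitionKey = "Comedy", RowKey = "Some Comedy", Rating = 3 },
            new MovieRow { PartitionKey = "SciFi", RowKey = "Star Trek", Rating = 5 },
        };
        foreach (MovieRow m in ParallelScan.Scan(table, new[] { "A", "D", "I" }, 4))
        {
            Console.WriteLine("{0} / {1}", m.PartitionKey, m.RowKey);
        }
    }
}
```
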
Query Speed
- Fast
  - Single PartitionKey and RowKey with equality
- Medium
  - Single partition but a small range for RowKey
  - Entire partition or table that is small
- Slow
  - Large single scan
  - Large table scan
  - "OR" predicates on keys => no query optimization => results in scan
- Expect continuation tokens in every case except the first (single PartitionKey and RowKey with equality)

Make Queries Faster
- Large scans
  - Split the range and parallelize queries
  - Create and maintain your own views that help queries
- "OR" predicates
  - Execute the individual queries in parallel instead of using "OR"
- User interactive
  - Cache the result to reduce scan frequency

Expect Continuation Tokens - Seriously!
- Maximum of 1000 rows in a response
- At the end of partition range boundary
- Maximum of 5 seconds to execute the query

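A sketch of the loop this slide asks for, run against a simulated service. `Page`, `PagedQuery`, `QuerySegment` and the use of a row offset as the token are assumptions made for the example; the real token is opaque. The point it illustrates is that the loop ends when the token is absent, not when a page comes back short or empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Page
{
    public List<int> Rows;
    public string Continuation; // null when there is nothing more to read
}

public static class PagedQuery
{
    // Simulated service call: returns at most pageSize rows plus a token
    // that tells the next call where to resume.
    public static Page QuerySegment(List<int> allRows, string continuation, int pageSize)
    {
        int start = continuation == null ? 0 : int.Parse(continuation);
        List<int> rows = allRows.Skip(start).Take(pageSize).ToList();
        int next = start + rows.Count;
        return new Page
        {
            Rows = rows,
            Continuation = next < allRows.Count ? next.ToString() : null
        };
    }

    // Keep requesting segments until no continuation token is returned.
    public static List<int> ReadAll(List<int> allRows, int pageSize)
    {
        var result = new List<int>();
        string token = null;
        do
        {
            Page page = QuerySegment(allRows, token, pageSize);
            result.AddRange(page.Rows);
            token = page.Continuation;
        } while (token != null);
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        List<int> rows = Enumerable.Range(0, 2500).ToList();
        Console.WriteLine(PagedQuery.ReadAll(rows, 1000).Count);
    }
}
```
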
Entity Group Transactions (EGT) (new)
- Atomically perform multiple insert/update/delete operations over entities in the same partition in a single transaction
- Maximum of 100 commands in a single transaction and payload < 4 MB
- ADO.NET Data Services: use SaveChangesOptions.Batch

Key Selection: Entity Group Transaction Case Study
- Maintain user account information
  - Account ID, User Name, Address, Number of rentals
- Maintain information of checked out rentals
  - Account ID, Movie Title, Check out date, Due date
- Solution 1: maintain two tables, Users and Rentals
  - Handle cross-table consistency
    - Insert into Rentals table succeeds
    - Update to Users table fails
  - Queue to maintain consistency

Solution 2
- Store account information and rental details in the same table
- Maintain the same PartitionKey to enforce transactions
  - Account ID as PartitionKey
  - Update total count and insert new rentals using Entity Group Transaction
- Prefix RowKey with "Kind" code: A = Account, R = Rental
  - Row key for account info: [Kind Code]_[AccountId]
  - Row key for rental info: [Kind Code]_[Title]
- Rental properties not set for Account row and vice versa

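The key layout on this slide can be illustrated with a few helpers. `Entity`, `RentalKeys` and `CanBatch` are hypothetical names for this sketch; the only facts taken from the deck are the A/R kind codes, the `[Kind Code]_[value]` row key shape, Account ID as PartitionKey, and the limit of 100 commands within one partition. The 4 MB payload limit is not checked here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entity
{
    public string PartitionKey;
    public string RowKey;
}

public static class RentalKeys
{
    // Row key for the account row: A_<AccountId>
    public static string AccountRowKey(string accountId)
    {
        return "A_" + accountId;
    }

    // Row key for a rental row: R_<Title>
    public static string RentalRowKey(string title)
    {
        return "R_" + title;
    }

    // A group of changes qualifies for one entity group transaction only if it has
    // 1 to 100 entities and all of them share the same PartitionKey.
    public static bool CanBatch(IList<Entity> entities)
    {
        return entities.Count > 0
            && entities.Count <= 100
            && entities.All(e => e.PartitionKey == entities[0].PartitionKey);
    }
}

public static class Program
{
    public static void Main()
    {
        string accountId = "42";
        var changes = new List<Entity>
        {
            // Account row whose rental count is updated
            new Entity { PartitionKey = accountId, RowKey = RentalKeys.AccountRowKey(accountId) },
            // New rental row inserted in the same transaction
            new Entity { PartitionKey = accountId, RowKey = RentalKeys.RentalRowKey("Star Trek") },
        };
        Console.WriteLine(RentalKeys.CanBatch(changes));
    }
}
```

Because both rows share the account's PartitionKey, the count update and the rental insert can succeed or fail together, which removes the cross-table consistency problem of Solution 1.
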
Best Practices & Summary
- Select PartitionKey and RowKey that help scale
  - Efficient for frequently used queries
  - Supports batch transactions
  - Distributes load
- Distribute "Append only" patterns using a prefix to PartitionKey
- Always handle continuation tokens
- Clients can maintain their own cache/views instead of frequent scans
  - Future feature: secondary index
- Execute parallel queries instead of "OR" predicates
- Implement back-off strategy for retries

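The deck does not say which back-off strategy to use. One common choice is exponential back-off with a cap, sketched below; the class `Retry`, its parameters and the doubling factor are assumptions for the example. Production code would normally retry only on transient errors (such as throttling) rather than on every exception, and often adds random jitter to the delay.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Delay before retrying after the given zero-based attempt:
    // baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan Delay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Runs the action up to maxAttempts times, sleeping between failed attempts.
    // The exception from the final attempt is rethrown to the caller.
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception)
            {
                if (attempt + 1 >= maxAttempts)
                {
                    throw;
                }
                Thread.Sleep(Delay(attempt, baseDelay, maxDelay));
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int calls = 0;
        // Simulated operation that fails twice before succeeding.
        string result = Retry.Execute(() =>
        {
            calls++;
            if (calls < 3)
            {
                throw new InvalidOperationException("simulated throttling");
            }
            return "done";
        }, 5, TimeSpan.FromMilliseconds(10), TimeSpan.FromSeconds(1));
        Console.WriteLine("{0} after {1} calls", result, calls);
    }
}
```
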
Windows Azure Queues
- Queues are performance efficient, highly available and provide reliable message delivery
  - Simple, asynchronous work dispatch
  - Programming semantics ensure that a message can be processed at least once
- Access is provided via REST

Queue Storage Concepts
Accounts contain queues, and queues contain messages.
- Account: sally
  - Queue: thumbnailjobs
    - Message: 128 x 128 http://...
    - Message: 256 x 256 http://...
  - Queue: traverselinks
    - Message: http://...
    - Message: http://...

Account, Queues and Messages
- An account can create many queues
  - Queue name is scoped by the account
- A queue contains messages
  - No limit on number of messages stored in a queue
  - Set a limit for message expiration
- Messages
  - Message size <= 8 KB
  - To store larger data, store data in blob/entity storage, and the blob/entity name in the message
  - Message now has dequeue count

Queue Operations
- Queue
  - Create Queue
  - Delete Queue
  - List Queues
  - Get/Set Queue Metadata
- Messages
  - Add Message (i.e. Enqueue Message)
  - Get Message(s) (i.e. Dequeue Message)
  - Peek Message(s)
  - Delete Message

Queue Programming API

    using System;
    using Microsoft.WindowsAzure.StorageClient;

    // baseUri and credentials are built as in the table sample
    CloudQueueClient queueClient = new CloudQueueClient(baseUri, credentials);
    CloudQueue queue = queueClient.GetQueueReference("test1");
    queue.CreateIfNotExist();

    // MessageCount is populated via FetchAttributes
    queue.FetchAttributes();

    CloudQueueMessage message = new CloudQueueMessage("Some content");
    queue.AddMessage(message);

    // The message stays invisible to other consumers for the visibility timeout
    message = queue.GetMessage(TimeSpan.FromMinutes(10) /* visibility timeout */);

    // Process the message here …

    queue.DeleteMessage(message);

Removing Poison Messages
(Diagram: producers P1 and P2 add messages to queue Q; consumers C1 and C2 read from it.)
1. C1: GetMessage(Q, 30 s) -> msg 1
2. C2: GetMessage(Q, 30 s) -> msg 2

Removing Poison Messages
1. C1: GetMessage(Q, 30 s) -> msg 1
2. C2: GetMessage(Q, 30 s) -> msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after Dequeue
7. C2: GetMessage(Q, 30 s) -> msg 1

Removing Poison Messages
1. C1: Dequeue(Q, 30 sec) -> msg 1
2. C2: Dequeue(Q, 30 sec) -> msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after Dequeue
7. C2: Dequeue(Q, 30 sec) -> msg 1
8. C2 crashed
9. msg 1 visible 30 s after Dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) -> msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)

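The sequence above can be simulated in memory. In this sketch `QueueMessage`, `PoisonHandling.Drain` and the re-enqueue on failure are stand-ins invented for the example: re-enqueueing imitates a message becoming visible again after its visibility timeout, and a thrown exception imitates a consumer crash. The threshold of 2 mirrors step 12 on the slide.

```csharp
using System;
using System.Collections.Generic;

public class QueueMessage
{
    public string Body;
    public int DequeueCount;
}

public static class PoisonHandling
{
    // Processes messages until the queue is empty and returns the bodies
    // that were removed as poison messages.
    public static List<string> Drain(Queue<QueueMessage> queue, Action<string> process, int maxDequeueCount)
    {
        var removedAsPoison = new List<string>();
        while (queue.Count > 0)
        {
            QueueMessage msg = queue.Dequeue();
            msg.DequeueCount++; // the count goes up on every dequeue

            if (msg.DequeueCount > maxDequeueCount)
            {
                // Already handed out too many times: delete it instead of processing again.
                removedAsPoison.Add(msg.Body);
                continue;
            }

            try
            {
                process(msg.Body); // success: the message is deleted (not put back)
            }
            catch (Exception)
            {
                // Failure: the message was never deleted, so it reappears later.
                queue.Enqueue(msg);
            }
        }
        return removedAsPoison;
    }
}

public static class Program
{
    public static void Main()
    {
        var queue = new Queue<QueueMessage>();
        queue.Enqueue(new QueueMessage { Body = "msg 1" });
        queue.Enqueue(new QueueMessage { Body = "msg 2" });

        // "msg 1" always fails, like the message that crashes C1 and C2.
        List<string> poison = PoisonHandling.Drain(queue, body =>
        {
            if (body == "msg 1")
            {
                throw new InvalidOperationException("simulated crash");
            }
        }, 2);

        Console.WriteLine(string.Join(", ", poison));
    }
}
```

A real worker might copy the poison message to a separate store for inspection before deleting it; the deck only mentions deletion.
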
Best Practices & Summary
- Make message processing idempotent
  - No need to deal with failures
- Do not rely on order
  - Invisible messages result in out-of-order processing
- Use dequeue count to remove poison messages
  - Enforce a threshold on the message's dequeue count
- Use message count to dynamically increase/reduce workers
- Use blob to store message data with reference in message
  - Messages > 8 KB
  - Batch messages
  - Garbage collect orphaned blobs

Summary
- Table
  - Scalable and reliable structured storage system
  - Partitioning is critical to scalability
  - Entity Group Transactions (new)
- Queue
  - Scalable and reliable messaging system
  - Dequeue count returned with message (new)
- Use back-off strategy on retries
- Official Storage Client Library (new)

Links
- http://comunidadwindows.org
- http://ecastrom.blogspot.com
- http://www.sqlazurelabs.com
- http://www.microsoft.com/windowsazure/
- http://sql.azure.com/

More Related Content

What's hot

Stream Application Development with Apache Kafka
Stream Application Development with Apache KafkaStream Application Development with Apache Kafka
Stream Application Development with Apache KafkaMatthias J. Sax
 
Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry DataWorks Summit/Hadoop Summit
 
Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019Evan Chan
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streamingphanleson
 
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...DataWorks Summit
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Databricks
 
Cost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveCost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveJulian Hyde
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineeringJulian Hyde
 
How to performance tune spark applications in large clusters
How to performance tune spark applications in large clustersHow to performance tune spark applications in large clusters
How to performance tune spark applications in large clustersOmkar Joshi
 
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Amazon Web Services
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkDatabricks
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector? confluent
 
SRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon RedshiftSRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon RedshiftAmazon Web Services
 
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...Amazon Web Services
 
I/O & virtualization performance with a search engine based on an xml databa...
 I/O & virtualization performance with a search engine based on an xml databa... I/O & virtualization performance with a search engine based on an xml databa...
I/O & virtualization performance with a search engine based on an xml databa...lucenerevolution
 
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Julian Hyde
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house queryCristinaMunteanu43
 

What's hot (20)

Stream Application Development with Apache Kafka
Stream Application Development with Apache KafkaStream Application Development with Apache Kafka
Stream Application Development with Apache Kafka
 
Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry
 
Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019Histograms at scale - Monitorama 2019
Histograms at scale - Monitorama 2019
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streaming
 
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
Cost-based query optimization in Apache Hive
Cost-based query optimization in Apache HiveCost-based query optimization in Apache Hive
Cost-based query optimization in Apache Hive
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineering
 
How to performance tune spark applications in large clusters
How to performance tune spark applications in large clustersHow to performance tune spark applications in large clusters
How to performance tune spark applications in large clusters
 
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
Redshift at Lightspeed: How to continuously optimize and modify Redshift sche...
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector?
 
SRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon RedshiftSRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon Redshift
 
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
 
I/O & virtualization performance with a search engine based on an xml databa...
 I/O & virtualization performance with a search engine based on an xml databa... I/O & virtualization performance with a search engine based on an xml databa...
I/O & virtualization performance with a search engine based on an xml databa...
 
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
Apache Calcite: A Foundational Framework for Optimized Query Processing Over ...
 
Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
 

Similar to Tablas y almacenamiento en windows azure

Exploring Windows Azure Cloud Storage
Exploring Windows Azure Cloud StorageExploring Windows Azure Cloud Storage
Exploring Windows Azure Cloud StorageK.Mohamed Faizal
 
Exploring azure cloud storage
Exploring azure cloud storageExploring azure cloud storage
Exploring azure cloud storageSpiffy
 
Microsoft Database Options
Microsoft Database OptionsMicrosoft Database Options
Microsoft Database OptionsDavid Chou
 
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMESet your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMEconfluent
 
Windows Azure Acid Test
Windows Azure Acid TestWindows Azure Acid Test
Windows Azure Acid Testexpanz
 
Windows Azure and a little SQL Data Services
Windows Azure and a little SQL Data ServicesWindows Azure and a little SQL Data Services
Windows Azure and a little SQL Data Servicesukdpe
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSAmazon Web Services
 
Introdução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon RedshiftIntrodução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon RedshiftAmazon Web Services LATAM
 
SQL Server 2008 for Developers
SQL Server 2008 for DevelopersSQL Server 2008 for Developers
SQL Server 2008 for Developersukdpe
 
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...Flink Forward
 
Saying goodbye to SQL Server 2000
Saying goodbye to SQL Server 2000Saying goodbye to SQL Server 2000
Saying goodbye to SQL Server 2000ukdpe
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architectureAjeet Singh
 
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de AplicaçõesWindows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de AplicaçõesComunidade NetPonto
 
Roles y Responsabilidades en SQL Azure
Roles y Responsabilidades en SQL AzureRoles y Responsabilidades en SQL Azure
Roles y Responsabilidades en SQL AzureEduardo Castro
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudScott Miao
 

Similar to Tablas y almacenamiento en windows azure (20)

Exploring Windows Azure Cloud Storage
Exploring Windows Azure Cloud StorageExploring Windows Azure Cloud Storage
Exploring Windows Azure Cloud Storage
 
Exploring azure cloud storage
Exploring azure cloud storageExploring azure cloud storage
Exploring azure cloud storage
 
Microsoft Database Options
Microsoft Database OptionsMicrosoft Database Options
Microsoft Database Options
 
Windows azure table storage – deep dive
Windows azure table storage – deep diveWindows azure table storage – deep dive
Windows azure table storage – deep dive
 
Sky High With Azure
Sky High With AzureSky High With Azure
Sky High With Azure
 
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMESet your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
 
Windows Azure Acid Test
Windows Azure Acid TestWindows Azure Acid Test
Windows Azure Acid Test
 
Windows Azure and a little SQL Data Services
Windows Azure and a little SQL Data ServicesWindows Azure and a little SQL Data Services
Windows Azure and a little SQL Data Services
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
Introdução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon RedshiftIntrodução ao data warehouse Amazon Redshift
Introdução ao data warehouse Amazon Redshift
 
SQL Server 2008 for Developers
SQL Server 2008 for DevelopersSQL Server 2008 for Developers
SQL Server 2008 for Developers
 
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
 
Saying goodbye to SQL Server 2000
Saying goodbye to SQL Server 2000Saying goodbye to SQL Server 2000
Saying goodbye to SQL Server 2000
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architecture
 
Introduction To Cloud Computing
Introduction To Cloud ComputingIntroduction To Cloud Computing
Introduction To Cloud Computing
 
Test automation process
Test automation processTest automation process
Test automation process
 
Test automation process _ QTP
Test automation process _ QTPTest automation process _ QTP
Test automation process _ QTP
 
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de AplicaçõesWindows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
 
Roles y Responsabilidades en SQL Azure
Roles y Responsabilidades en SQL AzureRoles y Responsabilidades en SQL Azure
Roles y Responsabilidades en SQL Azure
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloud
 

More from Eduardo Castro

Introducción a polybase en SQL Server
Introducción a polybase en SQL ServerIntroducción a polybase en SQL Server
Introducción a polybase en SQL ServerEduardo Castro
 
Creando tu primer ambiente de AI en Azure ML y SQL Server
Creando tu primer ambiente de AI en Azure ML y SQL ServerCreando tu primer ambiente de AI en Azure ML y SQL Server
Creando tu primer ambiente de AI en Azure ML y SQL ServerEduardo Castro
 
Seguridad en SQL Azure
Seguridad en SQL AzureSeguridad en SQL Azure
Seguridad en SQL AzureEduardo Castro
 
Azure Synapse Analytics MLflow
Azure Synapse Analytics MLflowAzure Synapse Analytics MLflow
Azure Synapse Analytics MLflowEduardo Castro
 
SQL Server 2019 con Windows Server 2022
SQL Server 2019 con Windows Server 2022SQL Server 2019 con Windows Server 2022
SQL Server 2019 con Windows Server 2022Eduardo Castro
 
Novedades en SQL Server 2022
Novedades en SQL Server 2022Novedades en SQL Server 2022
Novedades en SQL Server 2022Eduardo Castro
 
Introduccion a SQL Server 2022
Introduccion a SQL Server 2022Introduccion a SQL Server 2022
Introduccion a SQL Server 2022Eduardo Castro
 
Machine Learning con Azure Managed Instance
Machine Learning con Azure Managed InstanceMachine Learning con Azure Managed Instance
Machine Learning con Azure Managed InstanceEduardo Castro
 
Novedades en sql server 2022
Novedades en sql server 2022Novedades en sql server 2022
Novedades en sql server 2022Eduardo Castro
 
Sql server 2019 con windows server 2022
Sql server 2019 con windows server 2022Sql server 2019 con windows server 2022
Sql server 2019 con windows server 2022Eduardo Castro
 
Introduccion a databricks
Introduccion a databricksIntroduccion a databricks
Introduccion a databricksEduardo Castro
 
Pronosticos con sql server
Pronosticos con sql serverPronosticos con sql server
Pronosticos con sql serverEduardo Castro
 
Data warehouse con azure synapse analytics
Data warehouse con azure synapse analyticsData warehouse con azure synapse analytics
Data warehouse con azure synapse analyticsEduardo Castro
 
Que hay de nuevo en el Azure Data Lake Storage Gen2
Que hay de nuevo en el Azure Data Lake Storage Gen2Que hay de nuevo en el Azure Data Lake Storage Gen2
Que hay de nuevo en el Azure Data Lake Storage Gen2Eduardo Castro
 
Introduccion a Azure Synapse Analytics
Introduccion a Azure Synapse AnalyticsIntroduccion a Azure Synapse Analytics
Introduccion a Azure Synapse AnalyticsEduardo Castro
 
Seguridad de SQL Database en Azure
Seguridad de SQL Database en AzureSeguridad de SQL Database en Azure
Seguridad de SQL Database en AzureEduardo Castro
 
Python dentro de SQL Server
Python dentro de SQL ServerPython dentro de SQL Server
Python dentro de SQL ServerEduardo Castro
 
Servicios Cognitivos de de Microsoft
Servicios Cognitivos de de Microsoft Servicios Cognitivos de de Microsoft
Servicios Cognitivos de de Microsoft Eduardo Castro
 
Script de paso a paso de configuración de Secure Enclaves
Script de paso a paso de configuración de Secure EnclavesScript de paso a paso de configuración de Secure Enclaves
Script de paso a paso de configuración de Secure EnclavesEduardo Castro
 
Introducción a conceptos de SQL Server Secure Enclaves
Introducción a conceptos de SQL Server Secure EnclavesIntroducción a conceptos de SQL Server Secure Enclaves
Introducción a conceptos de SQL Server Secure EnclavesEduardo Castro
 

Tables and Storage in Windows Azure

  • 1. Windows Azure Tables and Queues Deep Dive Ing. Eduardo Castro, PhD Comunidad Windows ecastro@grupoasesor.net http://ecastrom.blogspot.com
  • 2. Agenda Overview of Windows Azure Tables Patterns and Practices for Windows Azure Tables Overview of Windows Azure Queues Patterns and Practices for Windows Azure Queues Q&A 2
  • 3. Fundamental Storage Abstractions Tables: provide structured storage. A table is a set of entities, each of which contains a set of properties. Queues: provide reliable storage and delivery of messages for an application. Blobs: provide a simple interface for storing named files along with metadata for the file. Drives: provide durable NTFS volumes for Windows Azure applications to use (new).
  • 4. Windows Azure Tables Provides Structured Storage Massively Scalable Tables Billions of entities (rows) and TBs of data Can use thousands of servers as traffic grows Highly Available & Durable Data is replicated several times Familiar and Easy to use API ADO.NET Data Services – .NET 3.5 SP1 .NET classes and LINQ REST – with any platform or language 4
  • 5. Table Storage Concepts (diagram): Accounts contain Tables, which contain Entities. The account moviesonline holds a Users table (entities with Email = …, Name = …) and a Movies table (entities with Genre = …, Title = …).
  • 6. Table Data Model Table A storage account can create many tables Table name is scoped by account Set of entities (i.e. rows) Entity Set of properties (columns) Required properties PartitionKey, RowKey and Timestamp 6
  • 7. Required Entity Properties PartitionKey & RowKey Uniquely identifies an entity Defines the sort order Use them to scale your application Timestamp Read only Optimistic Concurrency 7
  • 8. PartitionKey And Partitions PartitionKey Used to group entities in the table into partitions A table partition All entities with same partition key value Unit of scale Control entity locality Row key provides uniqueness within a partition 8
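The grouping and ordering described on slides 7 and 8 can be illustrated with a small in-memory sketch. It is plain Python, not the Storage Client Library; the sample movies and the `partitions` helper are assumptions made for illustration only.

```python
from itertools import groupby

# Hypothetical sample entities: (PartitionKey, RowKey) pairs for a Movies table,
# with the category as PartitionKey and the title as RowKey.
entities = [
    ("Comedy", "Groundhog Day"),
    ("Action", "The Bourne Ultimatum"),
    ("Action", "Die Hard"),
    ("Comedy", "Airplane!"),
]

def partitions(rows):
    """Sort by (PartitionKey, RowKey), then group the rows by PartitionKey."""
    ordered = sorted(rows)  # tuple comparison orders by PartitionKey, then RowKey
    return {pk: [rk for _, rk in group]
            for pk, group in groupby(ordered, key=lambda row: row[0])}

movie_partitions = partitions(entities)
```

Each dictionary key stands for one partition (the unit of scale), and the RowKey values inside it are unique and sorted.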
  • 9. Partitions and Partition Ranges (diagram): the Movies table served entirely by Server A, and the same table split by partition range across Server A, [Action - Comedy), and Server B, [Comedy - Western).
  • 10. Table Operations Table Create Query Delete Entities Insert Update Merge – Partial Update Replace – Update entire entity Delete Query Entity Group Transaction (new)
  • 11. Table Schema Define the schema as a .NET class:
      [DataServiceKey("PartitionKey", "RowKey")]
      public class Movie
      {
          /// <summary>Category is the partition key</summary>
          public string PartitionKey { get; set; }
          /// <summary>Title is the row key</summary>
          public string RowKey { get; set; }
          public DateTime Timestamp { get; set; }
          public int ReleaseYear { get; set; }
          public string Language { get; set; }
          public string Cast { get; set; }
      }
  • 12. Table SDK Sample Code
      StorageCredentialsAccountAndKey credentials =
          new StorageCredentialsAccountAndKey("myaccount", "myKey");
      string baseUri = "http://myaccount.table.core.windows.net";
      CloudTableClient tableClient = new CloudTableClient(baseUri, credentials);
      tableClient.CreateTable("Movies");
      TableServiceContext context = tableClient.GetDataServiceContext();
      CloudTableQuery<Movie> q =
          (from movie in context.CreateQuery<Movie>("Movies")
           where movie.PartitionKey == "Action" && movie.RowKey == "The Bourne Ultimatum"
           select movie).AsTableServiceQuery<Movie>();
      Movie movieToUpdate = q.FirstOrDefault();
      // Update movie
      context.UpdateObject(movieToUpdate);
      context.SaveChangesWithRetries();
      // Add movie (AddObject takes the table name as its first argument)
      context.AddObject("Movies", new Movie("Action", movieToAdd));
      context.SaveChangesWithRetries();
  • 13. Agenda Overview of Windows Azure Tables Patterns and Practices for Windows Azure Tables Overview of Windows Azure Queues Patterns and Practices for Windows Azure Queues Q & A 13
  • 14. Key Selection: Things to Consider Scalability: distribute load as much as possible; hot partitions can be load balanced; PartitionKey is critical for scalability. Query Efficiency & Speed: avoid frequent large scans; parallelize queries. Entity group transactions (new): transactions across a single partition; transaction semantics and fewer round trips.
  • 15. Key Selection: Case Study 1 Table for listing all movies Home page lists movies based on chosen category 15
  • 16. Movie Listing – Solution 1 Why do I need multiple PartitionKeys? Account name as Partition Key Movie title as RowKey since movie names need to be sorted Category as a separate property Does this scale? 16
  • 17. Movie Listing – Solution 1 Single partition: the entire table is served by one server. All requests are served by that single server. Does not scale. (Diagram: every client request goes to Server A.)
  • 18. Movie Listing – Solution 2 All movies partitioned by category. Allows the system to load balance hot partitions. Load is distributed. Better than a single partition. (Diagram: client requests are spread across Server A and Server B.)
  • 19. Key Selection: Case Study 2 Log every transaction into a table for diagnostics Scale Write Intensive Scenario Logs can be retrieved for a given time range 19
  • 20. Logging - Solution 1 Timestamp as PartitionKey. Looks like an obvious choice. It is not a single partition, because time moves forward, but it is append only: requests go to a single partition range, so load balancing does not help and the server may throttle. (Diagram: requests from the applications all land on one server.)
  • 21. Logging Solution 2 - Distribute "Append Only" Prefix the timestamp such that load is distributed: the Id of the node logging, or a hash into N buckets. Write load is now distributed, giving better throughput. To query logs in a time range, parallelize the query across prefix values. (Diagram: requests from the applications are spread across Server A and Server B.)
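A minimal sketch of the prefixing idea on slide 21, in plain Python. The bucket count, the CRC-based hash, the `NN_timestamp` key layout and every function name are assumptions for illustration, not something the slides prescribe. Timestamps are assumed to be fixed-width strings that sort chronologically.

```python
import zlib

NUM_BUCKETS = 4  # assumed value of N

def bucket_for(node_id, num_buckets=NUM_BUCKETS):
    """Hash the id of the logging node into one of N buckets (stable across runs)."""
    return zlib.crc32(node_id.encode("utf-8")) % num_buckets

def log_partition_key(node_id, timestamp):
    """PartitionKey = bucket prefix + timestamp, so appends spread over N ranges."""
    return f"{bucket_for(node_id):02d}_{timestamp}"

def time_range_queries(start, end, num_buckets=NUM_BUCKETS):
    """One (lower, upper) PartitionKey range per prefix; run these in parallel."""
    return [(f"{b:02d}_{start}", f"{b:02d}_{end}") for b in range(num_buckets)]
```

Reading a time range now costs N range queries instead of one, which is the trade-off for spreading the write load.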
  • 22. Key Selection: Query Efficiency & Speed Select keys that allow fast retrieval Reduce scan range Reduce scan frequency 22
  • 23. Single Entity Query Where PartitionKey = 'SciFi' and RowKey = 'Star Trek'. Efficient processing. No continuation tokens. (Diagram: one request from the client, one result returned.)
  • 24. Table Scan Query Select * from Movies where Rating > 4. Returns a continuation token when there are 1000 movies in the result set or a partition range boundary is hit. Serial processing: wait for the continuation token before proceeding. (Diagram: the client's first request returns 1000 movies and a continuation token; later requests hit a partition range boundary and return further continuation tokens as the scan moves from Server A to Server B.)
  • 25. Make Scans Faster Split "Select * from Movies where Rating > 4" into: Where PartitionKey >= "A" and PartitionKey < "D" and Rating > 4; Where PartitionKey >= "D" and PartitionKey < "I" and Rating > 4; etc. Execute in parallel. Each query handles its own continuation tokens. (Diagram: the client sends the range queries to Server A and Server B at the same time.)
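The range split on slide 25 can be sketched against an in-memory stand-in for the table. The sample rows, the third open-ended range and the function names are assumptions. A real client would issue one table query per range and follow each query's continuation tokens.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows standing in for the Movies table.
MOVIES = [
    {"PartitionKey": "Action", "RowKey": "Die Hard", "Rating": 5},
    {"PartitionKey": "Comedy", "RowKey": "Airplane!", "Rating": 4},
    {"PartitionKey": "Drama", "RowKey": "Heat", "Rating": 5},
    {"PartitionKey": "SciFi", "RowKey": "Star Trek", "Rating": 5},
]

# The first two ranges come from the slide; the open-ended last one is assumed.
RANGES = [("A", "D"), ("D", "I"), ("I", None)]

def scan_range(lo, hi, min_rating):
    """Stand-in for one query: lo <= PartitionKey < hi and Rating > min_rating."""
    return [m for m in MOVIES
            if m["PartitionKey"] >= lo
            and (hi is None or m["PartitionKey"] < hi)
            and m["Rating"] > min_rating]

def parallel_scan(ranges, min_rating):
    """Run one range query per worker and concatenate results in range order."""
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        chunks = pool.map(lambda r: scan_range(r[0], r[1], min_rating), ranges)
        return [movie for chunk in chunks for movie in chunk]
```

Because the ranges do not overlap, concatenating the per-range results gives the same rows as the single large scan.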
  • 26. Query Speed Fast: single PartitionKey and RowKey with equality. Medium: single partition but a small range for RowKey; entire partition or table that is small. Slow: large single scan; large table scan; "OR" predicates on keys => no query optimization => results in scan. Expect a continuation token in every case except the first (single PartitionKey and RowKey with equality).
  • 27. Make Queries Faster Large Scans Split the range and parallelize queries Create and maintain own views that help queries “Or” Predicates Execute individual query in parallel instead of using “OR” User Interactive Cache the result to reduce scan frequency 27
  • 28. Expect Continuation Tokens – Seriously! Maximum of 1000 rows in a response At the end of partition range boundary Maximum of 5 seconds to execute the query 28
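A sketch of the client loop that slide 28 implies, using a list as a stand-in for the service. `query_page` and `query_all` are hypothetical names, and the integer token is an assumption: the real service returns opaque next-partition and next-row keys. The loop stops only when no token comes back, not when a page is short or empty, because the service may end a page early at a partition range boundary or at the time limit.

```python
PAGE_LIMIT = 1000  # maximum rows per response, as stated on the slide

def query_page(rows, token=None, limit=PAGE_LIMIT):
    """Stand-in for one request: returns (page, continuation_token or None)."""
    start = token or 0
    page = rows[start:start + limit]
    next_start = start + len(page)
    return page, (next_start if next_start < len(rows) else None)

def query_all(rows, limit=PAGE_LIMIT):
    """Keep requesting pages until the service stops returning a token."""
    results, token = [], None
    while True:
        page, token = query_page(rows, token, limit)
        results.extend(page)
        if token is None:
            return results
```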
  • 29. Entity Group Transactions (EGT) (new) Atomically perform multiple insert/update/delete operations over entities in the same partition in a single transaction. Maximum of 100 commands in a single transaction, and payload < 4 MB. ADO.NET Data Services: use SaveChangesOptions.Batch.
  • 30. Key Selection: Entity Group Transaction Case Study Maintain user account information Account ID, User Name, Address, Number of rentals Maintain information of checked out rentals Account ID, Movie Title, Check out date, Due date Solution 1 – Maintain two tables – Users & Rentals Handle Cross table consistency Insert into Rentals table succeeds Update to Users table fails Queue to maintain consistency 30
  • 31. Solution 2 Store Account Information and Rental details in same table Maintain same PartitionKey to enforce transactions Account ID as PartitionKey Update total count and Insert new rentals using Entity Group Transaction Prefix RowKey with “Kind” code: A = Account, R = Rental Row key for account info: [Kind Code]_[AccountId] Row Key for rental info: [Kind Code]_[Title] Rental Properties not set for Account row and vice versa 31
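The layout on slide 31 can be sketched with a dictionary keyed by (PartitionKey, RowKey). The helper names, the `Rentals` and `DueDate` property names and the in-memory `apply_batch` are assumptions. In the .NET client the batch would go through SaveChangesOptions.Batch, as slide 29 notes.

```python
MAX_BATCH_OPS = 100  # command limit for one entity group transaction (slide 29)

def account_row_key(account_id):
    return f"A_{account_id}"  # "A" = Account kind code

def rental_row_key(title):
    return f"R_{title}"       # "R" = Rental kind code

def apply_batch(table, entities):
    """All-or-nothing upsert: validate everything first, then write."""
    if len(entities) > MAX_BATCH_OPS:
        raise ValueError("too many operations in one batch")
    if len({e["PartitionKey"] for e in entities}) > 1:
        raise ValueError("all entities in a batch must share one PartitionKey")
    for e in entities:
        table[(e["PartitionKey"], e["RowKey"])] = e

def check_out(table, account_id, title, due_date):
    """Increment the rental count and insert the rental row in one batch."""
    account = dict(table[(account_id, account_row_key(account_id))])
    account["Rentals"] += 1
    rental = {"PartitionKey": account_id, "RowKey": rental_row_key(title),
              "DueDate": due_date}
    apply_batch(table, [account, rental])
```

Since both rows share the account ID as PartitionKey, the count and the rental list cannot drift apart the way they can in the two-table design of Solution 1.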
  • 32. Best Practices & Summary Select PartitionKey and RowKey that help scale Efficient for frequently used queries Supports batch transactions Distributes load Distribute “Append only” patterns using prefix to PartitionKey Always Handle continuation tokens Client can maintain their own cache/views instead of frequent scans Future Feature - Secondary Index Execute parallel queries instead of “OR” predicates Implement back-off strategy for retries 32
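The back-off advice can be sketched as below. The function name, the default delays and the random jitter are assumptions rather than anything the slides specify. Catching every exception is a simplification: real code would retry only on throttling or timeout errors.

```python
import random
import time

def with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0,
                 sleep=time.sleep):
    """Call operation(); on failure wait up to base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # jitter keeps clients from retrying in lockstep
```

Passing `sleep` as a parameter makes the helper easy to exercise without real waiting.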
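The back-off advice can be sketched as below. The function name, the default delays and the random jitter are assumptions rather than anything the slides specify. Catching every exception is a simplification: real code would retry only on throttling or timeout errors.

```python
import random
import time

def with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0,
                 sleep=time.sleep):
    """Call operation(); on failure wait up to base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # jitter keeps clients from retrying in lockstep
```

Passing `sleep` as a parameter makes the helper easy to exercise without real waiting.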
  • 33. Agenda Overview of Windows Azure Tables Patterns and Practices for Windows Azure Tables Overview of Windows Azure Queues Patterns and Practices for Windows Azure Queues Q & A 33
  • 34. Windows Azure Queues Queues are performance efficient, highly available and provide reliable message delivery. Simple, asynchronous work dispatch. Programming semantics ensure that a message can be processed at least once. Access is provided via REST.
  • 35. Queue Storage Concepts Messages Queues Accounts 128 x 128 http://... thumbnailjobs 256 x 256 http://... sally http://... traverselinks http://... 35
  • 36. Account, Queues and Messages An account can create many queues Queue Name is scoped by the account A Queue contains messages No limit on number of messages stored in a queue Set a limit for message expiration Messages Message size <= 8 KB To store larger data, store data in blob/entity storage, and the blob/entity name in the message Message now has dequeue count 36
  • 37. Queue Operations Queue Create Queue Delete Queue List Queues Get/Set Queue Metadata Messages Add Message (i.e. Enqueue Message) Get Message(s) (i.e. Dequeue Message) Peek Message(s) Delete Message 37
  • 38. Queue Programming API
      CloudQueueClient queueClient = new CloudQueueClient(baseUri, credentials);
      CloudQueue queue = queueClient.GetQueueReference("test1");
      queue.CreateIfNotExist();
      // MessageCount is populated via FetchAttributes
      queue.FetchAttributes();
      CloudQueueMessage message = new CloudQueueMessage("Some content");
      queue.AddMessage(message);
      message = queue.GetMessage(TimeSpan.FromMinutes(10) /* visibility timeout */);
      // Process the message here …
      queue.DeleteMessage(message);
  • 39. Agenda Overview of Windows Azure Tables Patterns and Practices for Windows Azure Tables Overview of Windows Azure Queues Patterns and Practices for Windows Azure Queues Q & A 39
  • 40. Removing Poison Messages (diagram: producers P1 and P2, consumers C1 and C2, one queue Q) 1. C1: GetMessage(Q, 30 s) → msg 1. 2. C2: GetMessage(Q, 30 s) → msg 2.
  • 41. Removing Poison Messages (diagram, continued) 1. C1: GetMessage(Q, 30 s) → msg 1. 2. C2: GetMessage(Q, 30 s) → msg 2. 3. C2 consumed msg 2. 4. C2: DeleteMessage(Q, msg 2). 5. C1 crashed. 6. msg 1 becomes visible 30 s after dequeue. 7. C2: GetMessage(Q, 30 s) → msg 1.
  • 42. Removing Poison Messages (diagram, continued) 1. C1: Dequeue(Q, 30 sec) → msg 1. 2. C2: Dequeue(Q, 30 sec) → msg 2. 3. C2 consumed msg 2. 4. C2: Delete(Q, msg 2). 5. C1 crashed. 6. msg 1 becomes visible 30 s after dequeue. 7. C2: Dequeue(Q, 30 sec) → msg 1. 8. C2 crashed. 9. msg 1 becomes visible 30 s after dequeue. 10. C1 restarted. 11. C1: Dequeue(Q, 30 sec) → msg 1. 12. DequeueCount > 2. 13. C1: Delete(Q, msg 1).
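The sequence on slides 40 to 42 can be replayed with a small simulated queue. `SimQueue`, its manual clock, `consume_one` and the threshold constant are assumptions for illustration. The real service tracks visibility and the dequeue count on the server side.

```python
MAX_DEQUEUE_COUNT = 2  # threshold from step 12 on the slide (DequeueCount > 2)

class SimQueue:
    """In-memory stand-in for a queue with visibility timeouts."""

    def __init__(self):
        self.messages = []
        self.now = 0  # simulated clock, in seconds

    def add(self, body):
        self.messages.append({"body": body, "visible_at": 0, "dequeue_count": 0})

    def get(self, visibility_timeout):
        """Return the first visible message and hide it for the timeout."""
        for msg in self.messages:
            if msg["visible_at"] <= self.now:
                msg["visible_at"] = self.now + visibility_timeout
                msg["dequeue_count"] += 1
                return msg
        return None

    def delete(self, msg):
        self.messages = [m for m in self.messages if m is not msg]

def consume_one(queue, handler, timeout=30):
    msg = queue.get(timeout)
    if msg is None:
        return "empty"
    if msg["dequeue_count"] > MAX_DEQUEUE_COUNT:
        queue.delete(msg)  # poison message: give up instead of crashing again
        return "poison-removed"
    handler(msg["body"])   # an exception here models a consumer crash
    queue.delete(msg)      # delete only after successful processing
    return "processed"
```

A crashed handler never reaches `delete`, so the message reappears once its timeout expires. The dequeue count is what lets a later consumer recognise the message as poison.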
  • 43. Best Practices & Summary Make message processing idempotent: no need to deal with failures. Do not rely on order: invisible messages result in out-of-order delivery. Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count. Use the message count to dynamically increase or reduce workers. Use a blob to store message data, with a reference in the message: for messages > 8 KB and for batched messages; garbage collect orphaned blobs.
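The blob-reference practice can be sketched with a list standing in for the queue and a dictionary for blob storage. The function names and the blob naming scheme are assumptions.

```python
import uuid

MAX_MESSAGE_BYTES = 8 * 1024  # message size limit quoted on the slides

def enqueue(queue, blobs, payload):
    """Append a message; payloads over the limit go to the blob store instead."""
    if len(payload) <= MAX_MESSAGE_BYTES:
        queue.append({"inline": payload})
        return
    blob_name = f"payload-{uuid.uuid4()}"  # hypothetical naming scheme
    blobs[blob_name] = payload             # write the blob before the message
    queue.append({"blob": blob_name})

def read_payload(message, blobs):
    """Resolve a message to its payload, following the blob reference if any."""
    return blobs[message["blob"]] if "blob" in message else message["inline"]

def orphaned_blobs(queue, blobs):
    """Blob names that no queued message refers to (garbage collection candidates)."""
    referenced = {m["blob"] for m in queue if "blob" in m}
    return sorted(set(blobs) - referenced)
```

Writing the blob first means a crash between the two steps leaves an orphaned blob, not a message that points at nothing. A real collector would also have to leave alone blobs referenced by messages that are currently invisible.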
  • 44. Summary Table Scalable & Reliable Structured Storage System Partitioning is critical to scalability Entity Group Transactions (new) Queue Scalable & Reliable Messaging System Dequeue count returned with message (new) Use back-off strategy on retries Official Storage Client Library (new) 44
  • 45. Links http://comunidadwindows.org http://ecastrom.blogspot.com http://www.sqlazurelabs.com http://www.microsoft.com/windowsazure/ http://sql.azure.com/